Abstract


Stock price prediction is a crucial domain for market
investors because stock prices follow a volatile trend. A
reliable and accurate prediction model could assist
investors in making decisions. This paper
presents a Python script that demonstrates the
implementation of a deep learning based stock price
prediction paradigm for the top four stocks of the S&P 500
utilizing the Keras library in TensorFlow. The script
leverages historical stock price data, accessed through
the Yahoo Finance API, to train and test the model.
The data is visualized using Matplotlib to
demonstrate its evolution over time. Initially, the
performance of two baseline regressors, the
radial basis function (RBF) and random forest, is
assessed on the basis of standard performance metrics.
To reduce the prediction error, the deep learning
mechanisms of long short-term memory (LSTM) and
bidirectional LSTM are exploited. We report that
LSTM performs better than the other regressors.
The time series-based LSTM model demonstrates
consistent, reliable, and accurate predictions with lower
RMSE and MAPE values. A look-ahead table has also
been generated for one-, three-, and seven-day-ahead
stock price predictions for the four stocks utilizing all
four regressors.


Keywords—Stock price prediction; S&P500; Stacked 
LSTM; Apple; Amazon; Microsoft; NVIDIA. 


1 Introduction 


Stock prices play a crucial role in the financial growth 
and stability of a country, and their accurate prediction 
is essential for making informed investment decisions. 
However, the stock market is subject to volatility and 
uncertainty, which can make it challenging to 
accurately forecast stock prices. Artificial neural
networks (ANN) have been widely used in understanding
the hidden patterns of time series data [1]. Various
ANN models such as BPNN, RBFNN, and RNN have
been utilized. Other machine learning models, such as
support vector regression (SVR), kernel regression,
and random forest regression (RFR), have also been
used for stock price prediction [2]. These models use
non-parametric approaches and have shown promising
results in reducing estimation errors [3]. Hybrid
approaches such as LSTM-ARIMA [4], ANN-SVM
[5], and CNN-LSTM [6,7] have also been used to forecast
stock prices.


We have worked on the S&P 500 [8] to forecast short-term
stock prices. The highlights of the scheme are
summarized as follows:


• An LSTM based deep neural network model is
presented for short-term stock price prediction for
the top four companies of the S&P 500.

• The suggested architecture of the Single LSTM
prediction engine (SLPE) offers convenience in
the development and implementation process for
time-series datasets. This is attributed to its
enhanced learning capability within the
nonlinear feature space and its ability to
generalize effectively.

• The LSTM unit utilizes the sigmoid activation
function, offering advantages that facilitate
accelerated convergence.

• The efficiency, accuracy, and robustness of
SLPE are endorsed by implementation on the top
four stocks of the S&P 500 in terms of the MAE,
RMSE, R² Score, and MAPE performance
metrics.


The rest of the article is organized as follows: the 
description of the short-term prediction of stock price 
and the performance metrics are given in Section II.
Section III includes some of the existing prediction
techniques as well as the proposed methodology while 
the conclusion is provided in the last section. 


2 Material and Methods 


In the present work, ML-based techniques have been
implemented for stock price prediction on the Yahoo
Finance dataset. We propose a single LSTM
prediction engine (SLPE) using sigmoid as the kernel
function. The workflow schema of the proposed
approach, SLPE, is given in Figure-1.

Figure-1: An illustrated summary of the suggested
methodology, SLPE


2.1. LSTM based Model for Stock Prediction 


In this section, we delve into LSTM networks, a
specific type of RNN, as proposed by Hochreiter et al.
[15]. Figure-3 illustrates a simple RNN with three
inputs being fed to three neurons in a hidden layer,
each equipped with an activation function, denoted as
$H_1$, $H_2$, and $H_3$.


The output from the hidden layer is fed back through
a conceptual delay $z^{-1}$ and
connected to the hidden layer again [20]. The
idea of LSTM was proposed to provide a solution for
the vanishing gradient problem in RNN by employing
memory cells in each LSTM unit [21, 22].
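
The recurrence described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the script used in this work: the function name `rnn_forward`, the weight names `U`, `W`, `b`, the random initialization and the choice of `tanh` as the hidden activation are all assumptions made for the example. It uses three inputs and three hidden neurons, as in the description of Figure-3.

```python
import numpy as np

def rnn_forward(X, U, W, b):
    """Run a single-layer recurrent network over a sequence.

    X: (T, n_in) inputs, U: (n_hidden, n_in) input weights,
    W: (n_hidden, n_hidden) recurrent weights, b: (n_hidden,) bias.
    Returns the (T, n_hidden) hidden states.
    """
    n_hidden = W.shape[0]
    h = np.zeros(n_hidden)               # hidden state before the first step
    states = np.empty((X.shape[0], n_hidden))
    for t in range(X.shape[0]):
        # The previous hidden state h re-enters the layer (the unit delay).
        # tanh is an assumed activation; the text does not name one.
        h = np.tanh(U @ X[t] + W @ h + b)
        states[t] = h
    return states

# Three inputs feeding three hidden neurons, over five time steps.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
U = rng.normal(scale=0.5, size=(3, 3))
W = rng.normal(scale=0.5, size=(3, 3))
b = np.zeros(3)
states = rnn_forward(X, U, W, b)
print(states.shape)
```

Because the same matrix `W` is applied at every step, gradients propagated back through many steps are multiplied by it repeatedly, which is the source of the vanishing gradient problem that motivates LSTM.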


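An LSTM unit works in the same manner but with a
distinct architecture. LSTM has three types of gate
values, called input ($I^t$), forget ($F^t$), and output ($O^t$),
which are described in Equation (5). $W^I$, $W^F$, $W^O$
and $U^I$, $U^F$, $U^O$ are the weights of the input,
forget and output gates, respectively, and $\sigma$ is the
sigmoid function.

$$
\begin{aligned}
I^t &= \sigma\left(U^I X^t + W^I H^{t-1} + b^I\right) \\
F^t &= \sigma\left(U^F X^t + W^F H^{t-1} + b^F\right) \\
O^t &= \sigma\left(U^O X^t + W^O H^{t-1} + b^O\right)
\end{aligned} \tag{5}
$$
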
Equation (6) showcases the computation of the hidden
state $H^t$ in LSTM, following the same process as in
RNN, utilizing the current input and the previous
hidden state.


of! (6) 


Figure-3: One hidden layer of LSTM 


In contrast to RNN, the approach in LSTM involves
utilizing an input gate $I^t$ to determine the portion of
$H^t$ that should be preserved and stored in the memory
cell ($M^t$). In Equation (7), the memory cell component,
represented by ($M^t$), is introduced within the LSTM
unit.

$$
M^t = F^t \circ M^{t-1} + I^t \circ H^t \tag{7}
$$

Here $\circ$ denotes the element-wise multiplication between
the above-mentioned vectors. The mixture ($M^t$) in
Equation (8) comprises the element-wise multiplication
of the previous memory state ($M^{t-1}$) with the forget
gate $F^t$, and the element-wise multiplication of the
hidden state with the input gate $I^t$.


Consequently, this component represents a 
combination of the previous memory and the new 
input. The old memory can be completely forgotten
(when the forget gate $F^t$ equals all 0's), or the newly
calculated hidden state can be disregarded (when the
input gate $I^t$ equals all 0's). However, in most cases,
an intermediate condition is selected to prevent the 
complete vanishing of either one or both terms. By 
means of the gating mechanism, LSTM can effectively 
model long-term dependencies and regulate the 
behavior of memory by learning the parameters of the 
gates. 
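
To make the gating mechanism concrete, the following NumPy sketch performs one LSTM step along the lines of Equations (5) to (7). It is a minimal illustration under stated assumptions, not the Keras implementation used in this work: the helper `init_params`, the parameter names, the `tanh` activation for the hidden state of Equation (6), and the final output step (the output gate applied to `tanh` of the memory, as in the standard LSTM formulation) are assumptions, since the text does not specify them.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used for the three gates."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, rng, scale=0.1):
    """Hypothetical helper: small random weights and zero biases."""
    p = {}
    for g in ("I", "F", "O", "H"):
        p["U_" + g] = rng.normal(0.0, scale, (n_hidden, n_in))
        p["W_" + g] = rng.normal(0.0, scale, (n_hidden, n_hidden))
        p["b_" + g] = np.zeros(n_hidden)
    return p

def lstm_step(x_t, h_prev, m_prev, p):
    """One LSTM step: gates, hidden state, memory update, output."""
    # Gates (Equation 5): each entry lies strictly between 0 and 1.
    i_t = sigmoid(p["U_I"] @ x_t + p["W_I"] @ h_prev + p["b_I"])
    f_t = sigmoid(p["U_F"] @ x_t + p["W_F"] @ h_prev + p["b_F"])
    o_t = sigmoid(p["U_O"] @ x_t + p["W_O"] @ h_prev + p["b_O"])
    # Hidden state computed as in a plain RNN (Equation 6); tanh is assumed.
    h_new = np.tanh(p["U_H"] @ x_t + p["W_H"] @ h_prev + p["b_H"])
    # Memory cell (Equation 7): forget part of the old memory, admit part
    # of the new hidden state, both element-wise.
    m_t = f_t * m_prev + i_t * h_new
    # Assumed output step (standard LSTM): output gate applied to the memory.
    h_out = o_t * np.tanh(m_t)
    return h_out, m_t, (i_t, f_t, o_t)

rng = np.random.default_rng(0)
params = init_params(n_in=3, n_hidden=4, rng=rng)
x = rng.normal(size=3)
h, m, gates = lstm_step(x, np.zeros(4), np.ones(4), params)
print(h.shape, m.shape)
```

Driving the forget-gate bias strongly negative closes that gate, so the new memory no longer depends on the old one; doing the same for the input gate makes the new memory a scaled copy of the old one. These are the two limiting cases discussed above.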


2.2 S&P 500 stock Datasets 


The dataset used for stock price prediction includes ten
years of historical data for four prominent stocks in the
S&P 500 index: AAPL, MSFT, AMZN, and NVDA.
It encompasses various factors and metrics related to
stock prices and is regularly updated with current data
[8].


The correlation matrix shows that the variables most
correlated with the stock price are Open, High,
Low, Close, Adj Close and Volume. The heat map in
Figure-5 shows the most relevant attributes, which can be
exploited for accurate prediction of the stock price;
reducing the number of attributes also benefits the
performance of the model. Eighty percent (80%) of the data
is passed to the models for training and twenty percent
(20%) for testing.

Figure-5: Heat map of the correlation matrix for
AMZN
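
As an illustration of the two steps described in this subsection, the sketch below computes a correlation matrix over price attributes and splits the data 80/20. It is a hedged example only: the series are synthetic random walks rather than Yahoo Finance data, the helper name `chronological_split` is hypothetical, and keeping the split in time order (training on the oldest 80%) is an assumption that the text does not state explicitly.

```python
import numpy as np

def chronological_split(values, train_fraction=0.8):
    """Split rows in time order: the first part trains, the rest tests."""
    split = int(len(values) * train_fraction)
    return values[:split], values[split:]

# Synthetic stand-in for one stock's daily attributes (not real data).
rng = np.random.default_rng(42)
n_days = 250
close = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n_days))
open_ = close + rng.normal(0.0, 0.5, n_days)
high = np.maximum(open_, close) + np.abs(rng.normal(0.0, 0.5, n_days))
low = np.minimum(open_, close) - np.abs(rng.normal(0.0, 0.5, n_days))
volume = rng.integers(1_000_000, 5_000_000, n_days).astype(float)

columns = ["Open", "High", "Low", "Close", "Volume"]
data = np.column_stack([open_, high, low, close, volume])

# Pearson correlation between attributes; this matrix is what a heat map shows.
corr = np.corrcoef(data, rowvar=False)

train, test = chronological_split(data, train_fraction=0.8)
print(corr.shape, train.shape, test.shape)
```

A time-ordered split avoids training on days that come after the test period, which matters for time series even though a random split would give the same 80/20 proportions.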


2.3 Performance Evaluation of Stock
Predictors

Performance is evaluated in terms of RMSE,
MAE, MAPE, R² Score and MSE, defined in Equations
(10-14) by the following formulae:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(P_{\mathrm{observed}} - P_{\mathrm{predicted}}\right)^{2}} \tag{10}
$$

$$
\mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N}\left|P_{\mathrm{observed}} - P_{\mathrm{predicted}}\right| \tag{11}
$$

$$
\mathrm{MAPE} = \frac{1}{N}\sum_{n=1}^{N}\left|\frac{P_{\mathrm{observed}} - P_{\mathrm{predicted}}}{P_{\mathrm{observed}}}\right| \tag{12}
$$

Here $P_{\mathrm{observed}}$ and $P_{\mathrm{predicted}}$ denote the actual and
predicted stock values. A MAPE of less than 5% is
considered acceptably accurate [29].
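
For reference, the five indices can be computed as in the NumPy sketch below. The function name `regression_metrics` and the sample values are illustrative assumptions. MSE and the R² Score follow their standard definitions, and MAPE is returned as a fraction rather than a percentage, which is assumed here because it matches the magnitude of the values reported in Table-3.

```python
import numpy as np

def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (as a fraction), R2 and MSE."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = observed - predicted
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                            # residual sum of squares
    ss_tot = np.sum((observed - observed.mean()) ** 2)   # total sum of squares
    return {
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE": np.mean(np.abs(err / observed)),
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
    }

# Small made-up example: every prediction is off by exactly one unit.
observed = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 105.0, 105.0]
metrics = regression_metrics(observed, predicted)
print(metrics)  # RMSE, MAE and MSE are 1.0; R2 is 0.8
```

The R² Score becomes negative whenever a model predicts worse than a constant equal to the mean of the observed values, which is how the negative test scores for RBF and RFR in Table-3 should be read.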


3 Experimental Results and Discussion 


The analysis of stock price predictions using different
models is presented. On the test data, the LSTM model
achieves the best values across all error measures for
the selected stocks, with R² Scores between 0.943 and
0.987 and MAPE values of at most 0.027, as shown in
Table-3 and visualized in Figure-8. The BiLSTM model
performs well, particularly for MSFT and NVDA, as
indicated in Table-4 and depicted in Figure-9. The RBF
model is also employed, with the data reshaped for
compatibility with the Random Forest Regression (RFR)
model, yielding the results shown in Table-5. The RFR
model performs accurately during training for AAPL and
AMZN but struggles with testing, while MSFT and NVDA
exhibit mixed performance. Overall, the RFR model
accurately predicts AMZN stock prices but has
limitations with other stocks on unseen data, as
summarized in Table-6. The comparison of LSTM and
BiLSTM is shown in Figure-7, which indicates
improved predictions with LSTM as opposed to
standard BiLSTM.


3.1. Future Prediction for next 1st, 3rd and 7th day


According to the predictions provided in Figure-8, the
proposed SLPE model gives the best results compared
with RBF, RFR and BiLSTM. BiLSTM demonstrates a similar
trend, with slightly lower predictions compared to
SLPE but better than RBF and RFR, as shown in
Figure-8. Even though the market was facing fluctuations,
the results of the proposed SLPE were the best among
the models.
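
One way to prepare training pairs for one-, three- and seven-day-ahead prediction is to pair each window of past prices with the price a fixed number of days after the window ends. The sketch below illustrates this under stated assumptions: the function name `make_windows`, the window length of five days and the direct (one target per horizon) formulation are choices made for the example and are not taken from the script used in this work.

```python
import numpy as np

def make_windows(series, window, horizon):
    """Pair each run of `window` prices with the price `horizon` days later.

    X[i] holds series[i : i + window]; y[i] is the value `horizon` days
    after the last day of that window.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series too short for this window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window + horizon - 1 : window + horizon - 1 + n]
    return X, y

# Toy series 0, 1, ..., 19 so the alignment is easy to read off.
prices = np.arange(20.0)
for horizon in (1, 3, 7):
    X, y = make_windows(prices, window=5, horizon=horizon)
    print(horizon, X.shape, y[0])
```

With the toy series, the first window is days 0 to 4, so the one-, three- and seven-day-ahead targets are 5, 7 and 11. Longer horizons leave fewer usable pairs from the same series.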


Table-3: Error measures on training and testing data using LSTM, BiLSTM, RBF and RFR on four stocks.


               Training Data                      Test Data
Model   Stock  RMSE    MAE     R² Score  MAPE     RMSE     MAE      R² Score  MAPE
LSTM    AAPL   1.108   0.660   0.998     0.017    2.93     2.237    0.943     0.014
        MSFT   2.017   1.174   0.998     0.015    5.382    4.136    0.967     0.015
        AMZN   1.475   0.945   0.998     0.022    3.457    2.513    0.987     0.019
        NVDA   1.420   0.880   0.998     0.061    7.803    5.873    0.978     0.027
BiLSTM  AAPL   1.520   1.018   0.996     0.027    6.958    5.709    0.683     0.038
        MSFT   3.674   2.584   0.995     0.037    21.069   17.044   0.497     0.064
        AMZN   1.870   1.213   0.998     0.028    6.420    5.293    0.958     0.042
        NVDA   1.644   1.229   0.997     0.113    8.893    6.745    0.971     0.031
RBF     AAPL   5.867   4.317   0.950     0.097    47.547   39.083   -1.541    0.131
        MSFT   8.983   6.536   0.972     0.070    6.420    5.293    0.958     0.042
        AMZN   3.057   2.071   0.994     0.038    7.169    5.754    0.947     0.043
        NVDA   3.424   2.229   0.990     0.081    94.115   79.008   -2.187    0.335
RFR     AAPL   0.411   0.215   0.999     0.004    19.000   15.590   -1.349    0.096
        MSFT   0.691   0.370   0.999     0.004    50.209   41.594   -1.834    0.139

Figure-7: Comparison of actual and predicted stock prices and performance metrics of four stocks between LSTM
and BiLSTM (panel titles: Model Prediction of Test Data AAPL; Model Prediction of Test Data AMZN)

Figure-8: Comparison of actual and predicted stock prices for all four stocks using RFR, RBF, LSTM and BiLSTM
for one-, three-, and seven-day-ahead predictions


4 Conclusion 


Inferences on the performance of the designed
algorithm, the Single LSTM prediction engine (SLPE), are
listed as follows:


• A novel deep learning-based forecast engine
is presented that uses sigmoid as the kernel
function in the LSTM layers; it is employed for
short-term future stock price prediction on the S&P
500 dataset.


• The efficacy of the proposed model is
evaluated in terms of the performance metrics
MAPE, RMSE, R² Score and MAE, which are 0.014,
2.93, 0.943, and 2.237, respectively, on the AAPL
test data (Table-3).


The proposed scheme can utilize other activation
kernels, such as the tanh function, with a single LSTM
for possibly improved future predictions on the
S&P 500 dataset.